A New Memetic Algorithm for Multi-document Summarization Based on CHC Algorithm and Greedy Search

Authors

  • Martha Mendoza
  • Carlos Alberto Cobos Lozada
  • Elizabeth León-Guzmán
  • Manuel Lozano
  • Francisco J. Rodríguez
  • Enrique Herrera-Viedma
Abstract

Multi-document summarization is used to extract the most relevant sentences from a set of documents, allowing the user to access their content more quickly. This paper addresses the generation of extractive summaries from multiple documents as a binary optimization problem and proposes a method called MA-MultiSumm, based on the CHC evolutionary algorithm and greedy search, in which the objective function optimizes a linear combination of coverage and redundancy factors. MA-MultiSumm was compared with other state-of-the-art methods using ROUGE measures. The results showed that MA-MultiSumm outperforms all methods on the DUC2005 dataset, and on DUC2006 its results are very close to those of the best method. Furthermore, in a unified ranking, MA-MultiSumm was outperformed only by the DESAMC+DocSum method, which requires as many iterations of the evolutionary process as MA-MultiSumm. The experimental results show that the optimization-based approach to multi-document summarization is a promising research direction.
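The abstract does not give the exact formulas, so the following is only a minimal sketch of the general idea: a summary is a binary vector over sentences, its fitness is a linear combination of coverage and redundancy, and a greedy search improves a candidate by flipping one bit at a time. The cosine-to-centroid coverage, the pairwise-cosine redundancy, the weight `lam`, the `max_words` limit, and all function names are assumptions made for illustration, not the paper's definitions. The CHC evolutionary loop itself is omitted.

```python
import math
from collections import Counter
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def objective(x, vecs, centroid, lam=0.5):
    """Assumed fitness: lam * coverage - (1 - lam) * redundancy."""
    chosen = [v for bit, v in zip(x, vecs) if bit]
    coverage = sum(cosine(v, centroid) for v in chosen)
    redundancy = sum(cosine(a, b) for a, b in combinations(chosen, 2))
    return lam * coverage - (1 - lam) * redundancy


def greedy_improve(x, vecs, centroid, lengths, max_words, lam=0.5):
    """Repeatedly apply the single bit flip that most improves the fitness
    while keeping the summary within max_words; stop when none helps.
    The starting vector x is assumed to be feasible (e.g. all zeros)."""
    x = list(x)
    best = objective(x, vecs, centroid, lam)
    while True:
        best_i = None
        for i in range(len(x)):
            x[i] ^= 1  # tentatively flip bit i
            feasible = sum(n for bit, n in zip(x, lengths) if bit) <= max_words
            score = objective(x, vecs, centroid, lam)
            x[i] ^= 1  # undo the flip
            if feasible and score > best + 1e-12:
                best, best_i = score, i
        if best_i is None:
            return x, best
        x[best_i] ^= 1  # commit the best improving flip


# Toy input (hypothetical): sentences 0 and 1 are exact duplicates.
sentences = [
    "evolutionary algorithms optimize summaries",
    "evolutionary algorithms optimize summaries",
    "greedy search refines each candidate",
]
vecs = [Counter(s.split()) for s in sentences]
centroid = sum(vecs, Counter())
lengths = [len(s.split()) for s in sentences]
summary, score = greedy_improve([0, 0, 0], vecs, centroid, lengths, max_words=10)
print(summary, round(score, 3))
```

On this toy input the search selects sentences 0 and 2 and skips the duplicate sentence 1, because adding it would raise redundancy more than coverage and would also exceed the word limit. In a memetic algorithm, a local step of this kind would typically be applied to individuals produced by the evolutionary operators.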

Similar resources

Text Summarization Using Cuckoo Search Optimization Algorithm

Today, with the rapid growth of the World Wide Web and the creation of Internet sites and online text resources, text summarization has received considerable attention from researchers. Extractive text summarization is an important summarization method that consists of selecting the most representative sentences from the input document. When we are facing large-volume documents, the extr...

A new memetic algorithm for mitigating tandem automated guided vehicle system partitioning problem

An Automated Guided Vehicle System (AGVS) provides the flexibility and automation demanded by a Flexible Manufacturing System (FMS). However, with the growing concern for responsible management of resource use, it is crucial to manage these vehicles efficiently in order to reduce travel time and control conflicts and congestion. This paper presents the development process of a new Memetic Alg...

Detecting communities of workforces for the multi-skill resource-constrained project scheduling problem: A dandelion solution approach

This paper proposes a new mixed-integer model for the multi-skill resource-constrained project scheduling problem (MSRCPSP). The interactions between workers are represented as undirected networks. Therefore, for each required skill, an undirected network is formed which shows the relations of human resources. In this paper, community detection in networks is used to find the most compatible wo...

Biogeography-Based Optimization Algorithm for Automatic Extractive Text Summarization

Given the increasing number of documents, sites, online sources, and the users’ desire to quickly access information, automatic textual summarization has caught the attention of many researchers in this field. Researchers have presented different methods for text summarization as well as a useful summary of those texts including relevant document sentences. This study select...

MMDT: Multi-Objective Memetic Rule Learning from Decision Tree

In this article, a Multi-Objective Memetic Algorithm (MA) for rule learning is proposed. Prediction accuracy and interpretation are two measures that conflict with each other. In this approach, we consider both the accuracy and the interpretation of rule sets. Additionally, individual classifiers face other problems such as huge sizes, high dimensionality, and data sets with imbalanced class distributions. This...

Publication date: 2014